LaTeX pet peeves

Here are some of my very personal LaTeX pet peeves. Not being aware of these will probably not make any difference in your life unless you’re working with me. 👀

Not using the tilde (~), the “glue”

The tilde ~ is a non-breaking space, or you can think of it as a transparent ‘glue’ that occupies a little bit of space. It prevents LaTeX from inserting a line break at that position. With no space at all, the citation sits too close to the preceding word. With an ordinary space, LaTeX may break the line there, leaving a citation, reference, or number orphaned at the start of the next line, which looks awkward (to me).

🙂‍↔️

as shown by Author\cite{author2024}.
as shown by Author \cite{author2024}.
Dr. Smith ...
see Figure \ref{fig:results}

🙂‍↕️

as shown by Author~\cite{author2024}.
Dr.~Smith ...
see Figure~\ref{fig:results}

When to use it

The most common uses of ~ are the ones shown above: before \cite{} and \ref{}, and after abbreviated titles such as Dr.

The general rule: if a line break between two elements would look wrong, use ~.
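A few more places where the general rule tends to apply can be sketched as follows. This is a minimal illustration, not an exhaustive list: the label sec:method, the placeholder bibliography entry, and the sample sentences are assumptions made up for this example.

```latex
\documentclass{article}
\begin{document}
\section{Method}\label{sec:method}  % hypothetical section, only here so \ref resolves

% Name + cross-reference: keep "Section" and its number on the same line.
As described in Section~\ref{sec:method}, the setup is simple.

% Word + citation: the key author2024 is a placeholder.
This was shown by Author~\cite{author2024}.

% Number + the noun it counts.
The sample contains 120~participants.

% Titles and initials: ~ also avoids the wider end-of-sentence space
% that LaTeX would otherwise add after the period in "Dr.".
Dr.~Smith and J.~Doe

\begin{thebibliography}{9}
\bibitem{author2024} Placeholder entry for illustration.
\end{thebibliography}
\end{document}
```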

When NOT to use it

Unlike \cite{}, in most cases, footnotes should not have a tilde before them. The footnote mark is a superscript that attaches directly to the preceding word or punctuation; adding a space would create an awkward gap.

🙂‍↔️

This is important~\footnote{...} because ...

🙂‍↕️

This is important\footnote{...} and the sentence continues.
This is important.\footnote{...}

At the end of a sentence, the footnote typically goes after the period (American style). Some style guides place it before, so check your target venue, but never use a tilde either way.

Hyphen, en-dash, em-dash, and minus sign

There are specific use cases for each of the dashes and you want to use them appropriately. You can look up how they should be used, but here’s a summary:

| Name | LaTeX | Use |
| --- | --- | --- |
| Hyphen | - | Compound words, line breaks |
| En-dash | -- | Ranges, connections |
| Em-dash | --- | Parenthetical breaks |
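In source, the three dashes summarized above might look like this. It is a small sketch with made-up sample phrases; exactly which cases count as "connections" is a style-guide question, and the name pair below is just one common example.

```latex
\documentclass{article}
\begin{document}
% Hyphen: a compound word.
A well-known result.

% En-dash: numeric ranges.
See pages 10--20, covering 2019--2024.

% En-dash: a connection between two names.
The Kullback--Leibler divergence.

% Em-dash: a parenthetical break.
The results---surprising as they were---confirmed our hypothesis.
\end{document}
```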

The en-dash doesn’t come up much, so the em-dash is usually my problem. An em-dash sets off a parenthetical statement: it is visually much more distinct than a comma while not as abrupt as parentheses. In LaTeX, I think an em-dash with no surrounding spaces creates just the right amount of separation; adding spaces around it pushes the phrases too far apart (all to my eyes).

🙂‍↔️

The results - surprising as they were - confirmed our hypothesis.
The results --- surprising as they were --- confirmed our hypothesis.

🙂‍↕️

The results---surprising as they were---confirmed our hypothesis.

And, while we’re at it… a hyphen is also not a minus sign.

🙂‍↔️

The temperature was -10 degrees.

🙂‍↕️

The temperature was $-10$ degrees.

Mixing these symbols is not the end of the world (and there are multiple conflicting style guides), but I think it’s still good to follow these conventions (and it looks nicer).
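If you would rather not drop into math mode for every negative number, one alternative is the siunitx package, assuming it is available in your setup. This is only a sketch of that option; the post itself uses plain math mode.

```latex
\documentclass{article}
\usepackage{siunitx}  % assumption: siunitx is installed; it provides \num
\begin{document}
% Math mode turns the hyphen key into a true minus sign.
The temperature was $-10$ degrees.

% \num typesets the number, including its sign, with a proper minus.
The temperature was \num{-10} degrees.
\end{document}
```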

Not using \text{} or \operatorname{} for non-variable words

When a word or abbreviation in math mode represents a label rather than a product of variables, it should be set in upright text. Without \text{} or \operatorname{}, LaTeX treats each letter as a separate italic variable, producing wrong (IMO) spacing and wrong (IMO) semantics. Depending on the font, this can produce really ugly equations.

This applies to subscripts:

🙂‍↔️

$P_{data}$
$E_{effective}$
$\tau_{diff}$

This produces $P_{data}$, $E_{effective}$, $\tau_{diff}$, and so on.

🙂‍↕️

$P_{\text{data}}$
$E_{\text{effective}}$
$\tau_{\text{diff}}$

Which produce: $P_{\text{data}}$, $E_{\text{effective}}$, and $\tau_{\text{diff}}$.

This also applies to custom-defined symbols and functions:

🙂‍↔️

$PMI(x, y)$
$Diff(A, B)$

This produces $PMI(x, y)$ and $Diff(A, B)$. See the awkward spacing around the letters… 🫣

🙂‍↕️

$\operatorname{PMI}(x, y)$
$\operatorname{Diff}(A, B)$

Which produce: $\operatorname{PMI}(x, y)$ and $\operatorname{Diff}(A, B)$. Just look at how much nicer and more correct they look! So use \operatorname{} for function-like symbols and \text{} for subscript labels.

Also note that how bad it looks depends on the font. Letter combinations like “ff”, “fi”, and “fl” are ligatures in text mode (joined into a single glyph), but math mode disables ligatures because each letter is treated as a separate variable. This means “ff” renders as two separate italic f’s with a visible gap. Letters like “t” can also look awkward. The effect ranges from subtle (e.g., Computer Modern on screen) to quite ugly (e.g., Times-based fonts).
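If the same labeled symbol shows up many times, one option is to wrap it in a macro so the upright label is applied consistently. The sketch below assumes amsmath is loaded (it provides \text and \operatorname); the macro names \Pdata and \taudiff are hypothetical. It also illustrates one caveat: \text follows the surrounding text font, so inside italic text the label becomes italic again, whereas \mathrm stays upright.

```latex
\documentclass{article}
\usepackage{amsmath}  % provides \text and \operatorname

% Hypothetical macro names, defined once and reused everywhere.
\newcommand{\Pdata}{P_{\text{data}}}
\newcommand{\taudiff}{\tau_{\mathrm{diff}}}

\begin{document}
Upright labels via macros: $\Pdata$ and $\taudiff$.

% Inside italic text, \text picks up the italic shape; \mathrm does not.
\textit{Compare $P_{\text{data}}$ with $P_{\mathrm{data}}$ here.}
\end{document}
```

Which of the two you prefer is a matter of taste; \mathrm is the safer choice inside environments that italicize their body, such as many theorem styles.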

Not using predefined math operators

LaTeX provides predefined commands for standard functions like \log, \exp, \sin, \max, \min, \det, \lim, \Pr, \arg, etc. These are set in upright roman type with proper spacing, as they should be. Writing them as plain letters produces italicized, poorly spaced versions.

🙂‍↔️

$log(x)$
$argmax_{\theta}$
$Pr(A | B)$

$log(x)$, $argmax_{\theta}$, $Pr(A | B)$

🙂‍↕️

$\log(x)$
$\operatorname*{argmax}_{\theta}$
$\Pr(A \mid B)$

$\log(x)$, $\operatorname*{argmax}_{\theta}$, $\Pr(A \mid B)$

If a standard operator isn’t predefined (e.g., argmax), use \operatorname{argmax} or \operatorname*{argmax} (the * version allows limits to be placed above/below in display mode). You can also define your own with \DeclareMathOperator in the preamble.
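A minimal sketch of the \DeclareMathOperator route, assuming amsmath is loaded. The command names \PMI and \argmax are choices made for this example, and the two formulas (a maximum-likelihood estimate and the usual definition of pointwise mutual information) are there only to show the operators in use.

```latex
\documentclass{article}
\usepackage{amsmath}

% Starred form: the subscript goes underneath the operator in display mode.
\DeclareMathOperator*{\argmax}{arg\,max}
% Unstarred form: an upright, properly spaced function name.
\DeclareMathOperator{\PMI}{PMI}

\begin{document}
\[
  \hat{\theta} = \argmax_{\theta} \log \Pr(x \mid \theta),
  \qquad
  \PMI(x, y) = \log \frac{\Pr(x, y)}{\Pr(x)\,\Pr(y)}
\]
\end{document}
```

Declaring the operator once in the preamble keeps the body shorter than repeating \operatorname{} at every use, and changing the name later means editing a single line.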

Quotation marks

LaTeX doesn’t understand straight quotes (") natively. Some editors auto-insert curly “fancy” quotes as you type, and some LaTeX compilers may actually handle them properly (with the right encoding/font setup). But this is not reliable 🫣.

For properly typeset quotation marks, use backticks for opening quotes and apostrophes for closing quotes.

🙂‍↔️

She said "the word 'glyph' is lovely" to her friend.

🙂‍↕️

She said ``the word `glyph' is lovely'' to her friend.
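If you would rather not count backticks, another option is the csquotes package, assuming it is available. Its \enquote command picks the opening and closing marks for you and switches to inner quotes when nested. This is a sketch of an alternative, not something the backtick approach requires.

```latex
\documentclass{article}
\usepackage{csquotes}  % assumption: csquotes is installed; it provides \enquote
\begin{document}
% The outer \enquote gives double quotes; the nested one gives single quotes.
She said \enquote{the word \enquote{glyph} is lovely} to her friend.
\end{document}
```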

One sentence per line

LaTeX treats a single newline as a space, so how you break lines in your source has zero effect on the output. This means you can, and IMO should, use line breaks to reflect the logical structure of your writing rather than mimicking how it looks on paper.
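The newline rule can be checked with a tiny document. The one thing to watch is that a blank line is different: it ends the paragraph. The sentences below are made up for this sketch.

```latex
\documentclass{article}
\begin{document}
This sentence sits on its own source line.
This one joins the same paragraph, separated by an ordinary space.

% The blank line above, not the single newlines, starts a new paragraph.
This sentence begins the second paragraph.
\end{document}
```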

The practice is called semantic line breaks and goes back to Kernighan’s “UNIX for Beginners” (1974), where he advised: “Start each sentence on a new line. Make lines short, and break lines at natural places, such as after commas and semicolons, rather than randomly.”

🙂‍↔️

In this paper, we propose a novel method for estimating causal effects from observational data. Our approach builds on recent advances in representation learning. We evaluate on three benchmark datasets and show significant improvements over existing baselines.

🙂‍↕️

In this paper, we propose a novel method for estimating causal effects from observational data.
Our approach builds on recent advances in representation learning.
We evaluate on three benchmark datasets and show significant improvements over existing baselines.

Both produce identical output. But the second version gives you:

The counterargument is that the source looks ragged without soft wrap. But that’s an editor setting, not a document problem; most editors can soft-wrap without any issues. And the whole point of LaTeX is to separate meaning from presentation. We write \section{} instead of manually making text big and bold; one-sentence-per-line is the same philosophy applied to the source itself.